Novel sofosbuvir derivatives against SARS-CoV-2 RNA-dependent RNA polymerase: an in silico perspective

The human coronavirus, SARS-CoV-2, had a negative impact on both the economy and human health, and the emerging resistant variants are an ongoing threat. One essential protein to target to prevent virus replication is the viral RNA-dependent RNA polymerase (RdRp). Sofosbuvir, a uridine nucleotide analog that potently inhibits viral polymerase, has been found to help treat SARS-CoV-2 patients. This work combines molecular docking and dynamics simulation (MDS) to test 14 sofosbuvir-based modifications against SARS-CoV-2 RdRp. The results reveal comparable (slightly better) average binding affinity of five modifications (compounds 3, 4, 11, 12, and 14) to the parent molecule, sofosbuvir. Compounds 3 and 4 show the best average binding affinities against SARS-CoV-2 RdRp (− 16.28 ± 5.69 and − 16.25 ± 5.78 kcal/mol average binding energy compared to − 16.20 ± 6.35 kcal/mol for sofosbuvir) calculated by Molecular Mechanics Generalized Born Surface Area (MM-GBSA) after MDS. The present study proposes compounds 3 and 4 as potential SARS-CoV-2 RdRp blockers, although this has yet to be proven experimentally.

SARS-CoV-2 has infected an estimated 769 million people, with about 6.9 million deaths since its outbreak in 2019 (https:// www.who.int/ emerg encies/ disea ses/ novel-coron avirus-2019).The disease symptoms can range from mild symptoms such as cough, fever, fatigue, loss of smell, and loss of taste to more severe complications such as severe pneumonia, dyspnea, and organ dysfunction 1 .SARS-CoV-2 (also known as Covid- 19) is a singlestranded RNA virus in the Coronaviridae family 2 .SARS-CoV-2 has many known variants, such as alpha, beta, gamma, delta, and omicron variant 3 .In addition, other variants of interest include, epsilon, zeta, eta, theta, iota, kappa, lambda, and mu.Although disease mortality and morbidity have decreased with the development of vaccination, the emergence of different variants able to overcome the vaccines has caused recent waves and breakthrough infections 4 .
Several viral targets have been used in COVID-19 drug development, and these encompass crucial stages of the virus replication cycle [5][6][7][8][9] .For example, the spike protein is necessary during viral recognition and host-cell entry, the main protease (M pro ) during the proteolytic activation stage, and RNA-dependent RNA polymerase (RdRp) during the transcription stage 5 .Of these targets, RdRp has been shown to be the most appealing for drug discovery as it is a very conservative non-structural protein functions as RNA polymerase, which either copies the RNA of the virus for replication or produces a messenger RNA 10 .The consecutive aspartates active site were found and targeted in many viral RNA polymerases.Therefore, inhibition of RdRp prevents these essential steps in the lifecycle of SARS-CoV-2 and, thus, prevents replication and propagation of the virus.
Sofosbuvir is a uridine nucleotide analog that potently inhibits viral polymerases.It is an anti-RdRp that is found useful in treating SARS-CoV-2 patients 11 .Sofosbuvir is a protide prodrug that is hydrolyzed by liver enzymes after absorption to form the monophosphate uridine analog, which is further phosphorylated to form the active triphosphate form.Sofosbuvir was approved by FDA against hepatitis C Virus (HCV) in the year 2013.It is administered orally in combination with ribavirin to treat HCV of genotypes 2 and 3 12 .We choose this drug as it was deeply studied with millions of HCV patients worldwide during the past 10 years.

www.nature.com/scientificreports/
The SARS-CoV-2 RNA genome frequently mutates due to its rapid replication and lack of error proofreading by the viral RNA polymerase, which results in the emergence of drug resistance (e.g., the P323L mutation where the substitution of proline [Pro] to leucine [Leu] occurred at amino acid position 323 of RdRp) 13,14 .Therefore, continuous development of drugs is required to overcome emerging viral resistance.
In this study, our strategy was to carry out rational modifications to the general features of the sofosbuvir scaffold and to model the effect of such alterations on RdRp binding.Modifications were categorized into two groups based on the site of transformation: sugar modifications and uracil modifications.Sugar modifications include studying the effect of extending position 3 with the ethyl amino group or changing position 3 to the fluoro group (structures 1 and 5, Fig. 1) 15 .To mimic the binding of the triphosphate, an amino group, or polyhydroxylated-substituted amino groups, substitutes the hydroxyl of the ribosyl 5-position.The uracil modifications of sofosbuvir include attaching the ethylamino group at uracil N-3 and substituting the carbonyl oxygen at position 4 with sulphur or amino or aminoethyl amino groups (structures 6 -9, Fig. 1) 16 .Other modifications involve substitutions at the carbonyl oxygen at position 2 with thio, benzylthio, and dihydroxy benzylthio groups (structures 10 -12, Fig. 1).The effect of attaching hydrazine or phenylhydrazine at position 6 was also explored (structures 13 and 14, Fig. 1) 17 .To evaluate the proposed modifications, all analogs were docked in the catalytic site of RdRp, and the molecular dynamics of the best hits were determined to study the stability of the analogs in the gorge of the enzyme.
Molecular dynamics simulation (MDS) was successful in repurposing and prediction of new therapeutics against SARS-CoV-2 proteins 18,19 .The RdRp was previously studied using the MDS technique and was successful in predicting the potential usefulness of many drugs such as remdesivir against COVID-19 pandemic [20][21][22] .In the current study we use the same technique to test new scaffold molecules based on sofosbuvir.

Protein and ligands preparation
As previously reported, the PDB: ID 7BV2 was downloaded from the Protein Data Bank (www.pdb.org ) 23 and was then prepared by removing unwanted molecules and adding missing Hydrogen atoms using PyMOL software 24,25 .As a result, the prepared protein target contained only nonstructural protein 12, nonstructural protein 7, and nonstructural protein 8 as a complex, the active complex of the viral polymerase 26 .We choose this structure as it is of moderate resolution (2.5 Å) and containing the polymerase in conjunction with nsp7 and nsp12, representing the complete system.
The drug was modified and optimized using SCIGRESS 3.2 software 27 .First, the structures were minimized using the classical MM3 force field, followed by the semiempirical parameterization method (PM6) in water 28 .Next, the vibrations were calculated at the same level of computation to ensure their stability (not a transition www.nature.com/scientificreports/state).Finally, structures were optimized using the quantum mechanical method, which is based on density functional theory (DFT), using the B3LYP functional 29,30 .Again, geometry was judged by their infra-red spectrum calculated at the same level of theory.We ensure no negative vibrations (transition states) are found in the optimized compounds.

Molecular docking
The clustered trajectories (three clusters) obtained from the previous study were used as possible protein conformations for docking 24 .The clustering was performed using Chimera using default parameters.Prior to the simulation any water, ions, or ligands were removed.AutoDock Vina (7) was used for the flexible docking of the modified compounds with each cluster representative.The compounds were prepared by adding Gasteiger charges and converted to PDBQT file format using AutoDock Tools software 31 .The protein was also prepared by adding partial charges and converted to PDBQT file format.The grid box dimensions were set to 40 Å in each direction for each docking experiment to cover the active site residues.The box centers for the docking with cluster 1, cluster 2, and cluster 3 representative structures (after clustering the trajectories of the protein target) were set to (− 5.194, − 9.118, 0.000), (− 6.871, − 10.988, − 0.146), and (− 5.558, − 9.933, − 0.090), respectively.According to the calculated binding affinity, the best complex (having the lowest binding energy values) and the parent compound, sofosbuvir, were subjected to another MDS run of 100 ns.We compared the binding behaviour of the tested compounds to the reference drug (sofosbuvir) using the same protocol.

Molecular dynamic simulation (MDS)
The files were prepared for MDS using the CHARMM-GUI web server 32,33 (https:// www.charmm-gui.org/).The system was solvated using the TIP3P water model in a cubic box with a padding of 10 Å and the box size was 121 × 121 × 121 Å 3 then the systems were neutralized by adding 0.154 M NaCl which resulted in a total of around 166,000 Atoms.The temperature was set to the physiological value of 37 °C (310 K), and the CHARMM36 force field was utilized.CHARMM36 accurately predict protein parameters and is utilized by many MDS software when dealing with proteins.It could predict backbone & side-chain scalar couplings, residual dipolar couplings, and relaxation & side-chain order parameters 34 .Nanoscale molecular dynamics (NAMD) 2.13 software was used to perform the MDS calculation 35 .Before the production run, the minimization and equilibration steps were completed.The system was minimized for 10,000 steps and then equilibrated for 0.25 ns in an ensemble of a constant number of atoms, a constant volume, and a constant temperature (NVT).The temperature was maintained using Langevin dynamics 35,36 .Then, the production run was started for 100 ns in the NVT ensemble.Finally, the MDS trajectories were analyzed using in-house codes and visualizing molecular dynamics (VMD) 1.9.3 software 37 .In addition, the trajectory of the best two complexes and the reference compound were clustered using TTClust and representative frames for each cluster were obtained.Protein-Ligand Interaction Profiler (PLIP) was used with these frames to detect the interacting amino acids [38][39][40] .

Binding energy calculations during the molecular dynamics simulation for the best hits
MMPBSA.py module implemented in the AmberTools 21 package was utilized to determine the binding affinity for the best-hit compounds (3 and 4) and the positive control sofosbuvir 41,42 .In addition, decomposition analysis was calculated to determine the contribution of amino acids to total binding energy.The salt concentration and solvation method (igb) were set to 0.154 M and 5, respectively.The internal and external dielectric constants were set to 1.0 and 80.0, respectively, and other options were assigned to the default values.The molecular mechanics-generalized Born surface area (MM-GBSA) approach is depicted in Eq. (1).
where < > represents the average of the enclosed free energies of complex, receptor, and ligand over the frames used in the calculation.In our approach, we used the whole trajectory (1000 frames).Different energy terms can be calculated according to Eqs. ( 2) -( 6) as follows: where ∆H is the enthalpy calculated from gas-phase energy (E gas ) and solvation-free energy (E sol ), − T∆S is the entropy contribution to the free binding energy, and E gas is composed of electrostatic term E ele and van der Waals term E vdW .E sol can be calculated from the polar solvation energy (E GB ) and nonpolar solvation energy (E SA ), which are estimated from the solvent-accessible surface area.The entropy contribution was calculated using the interaction entropy method using the last 90 ns 43,44 . (1)

Results and discussion
Sofosbuvir is a uridine nucleotide analog that potently inhibits HCV NS5B polymerase.Sofosbuvir is a protide prodrug that is hydrolyzed by liver enzymes after absorption to form the monophosphate uridine analog, which is further phosphorylated to form the active triphosphate form.Sofosbuvir received FDA approval for use against HCV in 2013; it is administered orally in combination with ribavirin for the treatment of genotypes 2 and 3 45 .

Molecular docking
The average docking scores of the suggested modifications of sofosbuvir (Fig. 1) and the positive control compound (sofosbuvir triphosphate), calculated with AutoDock Vina, are presented in the bar graphs in Fig. 2. The docking experiments were performed with the three different conformations of the SARS-CoV-2 RdRp after 100 ns MDS 24 .The error bars represent the standard deviations.As shown in Fig. 2, the fourteen compounds showed average binding energies to the SARS-CoV-2 RdRp active site comparable to that of the positive control and parent compound, sofosbuvir.Additionally, five compounds (3, 4, 11, 12, and 14) showed better average values than sofosbuvir in binding SARS-CoV-2 RdRp, while compounds 3 and 4 were the best hits in binding the viral polymerase.Table 1 summarizes the formed interactions that were established upon docking these fourteen compounds in the active site of the SARS-CoV-2 RdRp.Two types of interaction stabilized the complexes; Hydrogen bonding (H-bonds) and hydrophobic interactions.On average, 2.9 ± 1.8 H-bonds and 1.8 ± 0.8 hydrophobic contacts were established upon docking the fourteen ligands against the SARS-CoV-2 RdRp active site.In contrast, five H-bonds were formed for sofosbuvir, and only one hydrophobic contact was established upon docking.For the best two compounds, 3 and 4, seven interactions were established between the ligands and the protein residues.Additionally, the following six residues were the most significant contributors in the fourteen compounds against the SARS-CoV-2 RdRp active site: D618 (8), D761 (5), A762 (7), W800 (5), E811 (4), and C813 (4), with the numbers in brackets, indicating the total number of interactions (as shown in Table 1, bolded residues).All these residues were found in the active site pocket of the RdRp, as previously reported 46 .

Molecular dynamics of the sofosbuvir-and the best compounds -RdRp complexes
MDS was used to assess protein system stability and to assess the binding affinities during system equilibration 47 .We simulated the protein-ligand complexes for both sofosbuvir (positive control) and compounds 3 and 4 for 100 ns MDS each.Figure 3 shows the root-mean-square deviation (RMSD) in Å (A), the radius of gyration (RoG) in Å (B), the surface accessible surface area (SASA) in Å 2 (C), and the total number of formed H-bonds (D) versus the simulation time in nanoseconds.Sofosbuvir-RdRp complexes are shown in red, while compounds 3 and 4-RdRp complexes are shown in gray and blue, respectively.Based on the RMSD curves (Fig. 3A), all the complexes were stable, and the systems were slowly equilibrated during the simulation period with RMSD values between 2 and 3.2 Å during the second half of the simulation.Additionally, the systems were stable during the simulation with RoG values around 31 Å (Fig. 3B), SASA values between 44,000 and 47,000 Å 2 (Fig. 3C), and total system H-bonds between 1600 and 1750 (Fig. 3D).The per-residue root-mean-square fluctuations  3E.The active site residues, which are marked on the RMSF curve, were stable (RMSF > 2Å).In general, the RMSF showed stable protein with minimal high fluctuation regions (RMSF < 3Å), as reported previously 48 .This indicates the stability of the complexes during the simulation period.The ligand-RMSD (Fig. 3F) also indicate stable complexes during the simulation period.The Fig. 4 shows the results of trajectory clustering and the interacting amino acids as detected by PLIP.As can be seen in the representative frames of sofosbuvir, D760 and D761 are nearly common in all representative frames and forming three to four H-bonds.Moreover, K621 forms a salt bridge with sofosbuvir.For compound 3, the first two cluster representatives show V495 and Y516 forming one hydrophobic contact and one H-bond, respectively.In addition, the Fluorine atom form a halogen bond with Q492.For compound 4, there is no common amino acids in the representative frames, however, K545 forms a π-cation interaction.
The binding energies of the top hits' (compounds 3 and 4) complexes with RdRp were compared to the sofosbuvir-RdRp complexes during the simulation period using the molecular mechanics-generalized Born surface area (MM-GBSA) method.All the equilibrated region of trajectories (the last 90 ns) were used for the MM-GBSA calculations (stride = 1), and the values are listed in Table 2.The residual contribution to binding is listed in the table, where highly contributed residues (binding energy > -0.85 kcal/mol) are shown in bold.In all complexes, the ligand (LIG) was the most contributed element for the binding with values between − 7.78 and − 10.10 kcal/mol.The average total binding energy for the compounds 3-and 4-RdRp complexes (− 16.28 ± 5.69 and − 16.25 ± 5.78 kcal/mol) is comparable to that of the average binding energy of sofosbuvir-RdRp complexes (− 16.20 ± 6.35 kcal/ mol).However, the entropy contribution of the reference compound is higher than that of the two proposed compounds.This indicates the feasibility of using these candidates as anti-SARS-CoV-2 RdRp after toxicity testing and experimental validation.

ADMET prediction
With the help of the pkCSM web server 49 , the ADMET (Absorption, Distribution, Metabolism, Excretion, and Toxicity) properties of the three compounds were predicted.Table 3 shows the predicted values for each property.The three compounds show similar water solubility (log(mol/L)), with compound 4 being the more soluble one (− 2.84 log(mol/L)), followed by compound 3 (− 2.524 log(mol/L)), and then sofosbuvir (− 2.271 log(mol/L)), which are in the same region as other druggable compounds 50 .On the other hand, the Caco2 permeability shows a similar trend, with compound 4 as the most permeable (58.2%), followed by compound 3 (49.9%),and then sofosbuvir (28.7%).
The fraction unbound indicates the predicted fraction of compound that will be unbound to serum proteins.High unbound fraction, low blood-brain barrier (BBB) permeability (log BBB), and low central nervous system (CNS) permeability (log CNS) indicate a good distribution and low permeability to the BBB and CNS.All three compounds show low BBB permeability as they have values < − 1 and low CNS permeability (values < − 3).However, the fraction unbound shows a better value for the reference compound in comparison with the two compounds.
Inhibitors of Cytochrome P450 can activate the drug metabolism and, therefore, can remove the compound from the market.All compounds were predicted to not act as inhibitors of different isoforms (CYP1A2, CYP2C19, CYP2C9, CYP2D6, and CYP3A4).For excretion, the model predicts that all compounds are not substrates for renal organic cation transporter 2. The interaction with this transporter helps in the clearance of the compound and may produce adverse interactions; therefore, a negative prediction is considered good.
Finally, four indicators are used to predict the toxicity of the compounds.AMES toxicity is a test that indicates whether the compound is a carcinogen.Inhibition of hERG I/II is the principal cause of fatal ventricular arrhythmia and has resulted in the withdrawal of many substances.As its name implies, hepatotoxicity indicates whether the compound may disrupt the normal function of the liver.The server predicted that all compounds do not act as carcinogens or inhibitors for hERG I/II.On the other hand, the two compounds were predicted to have a hepatotoxic effect, while the reference does not.

Conclusion
SARS-CoV-2 RdRp is a well-known viral protein target for drug designers.It has been studied for possible inhibition by previously approved drugs, such as remdesivir and sofosbuvir.In this study, we tested 14 novel modifications of sofosbuvir against the RdRp of SARS-CoV-2 in silico.Our docking study revealed the potential of five modifications (compounds 3, 4, 11, 12, and 14), as they obtained binding comparable to that of the parent compound, sofosbuvir, against the RdRp active site.Additionally, the molecular dynamics simulation revealed the effectiveness of compounds 3 and 4 as the best RdRp binders.It form stable complexes with SARS-CoV-2 RdRp mainly through V415, V477, R489, and Q493 with binding energies of − 16.28 and − 16.25 for compounds 3 and 4, respectively.This study paves the way for the laboratory verification of for novel SARS-CoV-2 RdRp blockers.Table 2.The MM-GBSA calculations for sofosbuvir and the best hits (compounds 3 and 4) complexes with SARS-CoV-2 RdRp after 100 ns MDS.Bold residues indicate the most contributed residues (binding energy < − 0.85 kcal/mol), while red-colored residues have a negative contribution to the binding (positive binding energies).The average binding free energies and their individual terms are shown at the bottom for each complex, along with its standard deviations.Entropy (− TΔS) was calculated using the interaction entropy using the last 90 ns of each trajectory.

Figure 1 .
Figure 1.Sugar and uracil modifications to the sofosbuvir drug.

Figure 3 .
Figure 3.The molecular dynamics simulation analysis of the three best-hit complexes (4-RdRp) and the three sofosbuvir-RdRp complexes.(A), (B), (C), and (D) are the root-mean-square deviation (RMSD), the radius of gyration (RoG), the surface accessible surface area (SASA), and the total number of formed H-bonds versus the simulation time in ns.(E) The pre-residue root-mean-square fluctuations (RMSF) of the six complexes during the simulation period.The position of active site residues (D760 and D761) is marked in green.(F) The ligand RMSD versus the simulation time in ns.

Table 1 .
Detailed interactions established upon docking the sofosbuvir and its derivatives (1-14) against SARS-CoV-2 RdRp in complex with NSP8 and NSP12 retrieved from PLIP webserver and PyMOL software.Bold residues indicate the residues most reported to form interactions with the ligands.

Table 3 .
The ADMET predictions performed by pkCSM web server.